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ABSTRACT 

We  describe  a  numerical  procedure  for  solving  problems  involving  mmimi- 
zation  over  the  rotation  group  of  quadratic  forms  which  arise  in  connection  with 
problems  of  computer  vision.  The  algorithm  presented  is  a  sequential  quadratic 
programming  method  which  takes  advantage  of  the  special  structure  of  the  prob- 
lem constraints.  We  conclude  by  proving  that  our  method  is  globally  convergent. 
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1.  Introduction 

The  classical  theory  of  eigenvalue  problems  grows  out  of  a  quadratic  minimization  problem, 
to  wit:  minimize  one  quadratic  form  x  K  iX  while  holding  another  quadratic  form  i  K nx  con- 
stant. (Here  it  is  best  to  assume  that  A'o  is  positive  definite).  This  simplest  observation  makes  it 
clear  that  the  eigenvalue  problem  is  the  most  rudimentary  form  of  a  more  general  problem, 
namely: 

(1)  minimize  the  quadratic  form  x^  Kx  while  holdmg  n  other  quadratic  forms  x -^  A',  j 
constant. 

Introduction  of  Lagrange  multipliers,  as  in  the  standard  eigenvalue  case,  reduces  this  to  an 
algebraic  problem  having  a  'generalized  eigenvalue'  form:    solve  the  equations 
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(2)  (K-\iKi--  ■  ■  -\i,K,)x  =0,     x^KiX=Ci,     i=l,...,k 

However,  only  inefficient  numerical  procedures  for  solving  algebraic  systems  of  this  generality  are 
known,  nor  is  any  systematic  theoretical  analysis  of  problems  of  this  form  available. 

Several  minimization  problems  of  this  class  having  to  do  with  minimization  over  the  rota- 
tion group  of  quadratic  forms  arise  naturally  in  connection  with  problems  of  computer  vision. 
The  general  problem  of  this  somewhat  more  restricted  form  can  be  written  as 

3 

(3)  minimize         Y^    «.-^A',y  uy 

over  all  mutually  orthogonal  sets  of  unit-length  vectors  (u,,Uo,«3),  where  the  coefficient  matrices 
K^-  satisfy  AT,-.-  =  iv.'.  The  present  report  describes  a  numerical  procedure  for  solving  a 
significant  case  of  the  quadratic  minimization  problem  (3),  namely  that  in  which  the  off-diagonal 
matrices  K^j  all  vanish,  reducing  the  problem  to 

(4)  minimize        f  (u  ,v  ,w  )  ^  u  ^  Au   +  v^  Bv   +  w"^  Cw 

subject  to         u  ■^  u    =  1 
v"^  V    =1 

WW     =1 

«  ^t;  =  0 
u  ^  u)  =  0 
v^  u;   =  0 

when  the  matrices  A  ,  B  ,  C  are  all  symmetric.  Note  that  by  adding  an  appropriate  positive  mul- 
tiple of  u'^+v'^+w'  to  the  form  (4),  we  can  suppose  that  the  matrices  A  ,  B  ,  and  C  are  also  posi- 
tive definite. 


2.  Description  of  the  Algorithm 

The  problem  under  consideration  is  a  special  case  of  the  general  equality  constrained  non- 
hnear  programmmg  problem 
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(5)  mm   /  {x  ) 

s.t.    c,  (x  )  =  0,     «   ^  1,  ■  •  •  ,m. 

Let  g  denote  the  gradient  of  the  objective  function  and  let  A  be  the  matrix  whose  rows  consist  of 
the  transposes  of  the  gradients  of  the  constraint  functions.  Let  G  and  G,  denote  the  Hessians  of 
the  objective  and  constraint  functions,  respectively.  We  now  define  F  to  be  a  matrix  with 
orthogonal  columns  spanning  the  range  space  of  A^  and  Z  to  be  a  similar  matrix  whose  columns 
form  a  basis  for  the  null  space  of  A  . 

The  first-order  necessary  conditions  for  (u  ,v  ,w  )  to  be  a  local  minimum  of  (5)  are  that 
the  constraints  are  satisfied  and  that  there  exists  a  vector  of  Lagrange  multipliers,  X    ,  such  that 

(6)  g  {u  '  ,v  '  ,w  ' )  =  A  '^  \' . 

This  condition  is  equivalent  to  requiring  (u  ,v  ,w  )  to  be  a  stationary  point  of  the  Lagrangian 
function 

m 

L  {u  ,V  ,W  ,\)  =   f   {u  ,V  ,w)  -    Y]  ^i  '^i  ("  i^'  ."•'  ) 

1=1 

when  X  ^  X' .    Alternatively,  this  condition  can  be  stated  as  Z     g   ^  0. 

Second-order  optimality  conditions  are  expressed  in  terms  of  the  Hessian  with  respect  to 
(u  ,v  ,w  )  of  the  Lagrangian  function,  which  is  defined  by 

m 
W  (u  ,V  ,W  ,X)  =   G  (u  ,V  ,w)  -    Yj^i  ^i  ("  '^  '"'  )■ 

l'  =1 

The  second-order  necessary  condition  for  (u  ,v  ,w  )  to  be  a  solution  of  (4)  is  that  the  projected 
Hessian,  Z (u  '  ,v  '  ,w  ' )  W (u  '  ,v  '  ,w  '  ,\'  )Z {u  '  ,v  '  ,w  ' ),  be  positive  semi-definite;  strict  posi- 
tive definiteness  of  the  projected  Hessian  together  with  the  satisfaction  of  the  first-order  condi- 
tions is  a  sufTicient  condition.    (See  Fletcher  (1981)  for  more  detail.) 

Algorithms  based  on  sequential  quadratic  programming  have  been  found  particularh' 
efi'ective  for  solvmg  problems  of  this  nature.  Methods  of  this  type  approximate  the  minimizer  of 
(5)  by  solving  a  sequence  of  quadratic  programming  (QP)  subproblems.    In  these  subproblems  a 
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local  model  of  the  nonlinear  problem  is  posed  in  which  the  curvature  of  the  Lagrangian  function  is 
approximated  by  a  suitable  matrix  and  linear  approximations  are  made  to  the  constraints.  The 
solution  of  the  k  th  QP  is  used  to  provide  a  search  direction  p  * .  The  algorithm  then  determines 
a  step  length,  o    ,  such  that  x      +  a*  p  *   is  a  sufficiently  better  approximation  to  2  ' . 

The  QP  subproblem  for  (5)  is  given  by 

min  It  p  ^  Wp   +  p^  g 
pen' 

s.t.      Ap   =  -c 

The  solution  can  be  easily  computed  by  solving  two  systems  of  linear  equations.  The  solution, 
p    ,  can  be  written  as 

p    =  yp„  +  Zp. 

where  >'  and  Z  are  defined  as  described  above.    The  vectors  p^'  and  p.*  can  be  found  by  solving 
(7a)  AYp^' = -c 

(7b)  Z  ^  WZp/  =-Z'^  g  -  Z  ^  UTp/ 

The  solutions  are  unique  when  .4  has  full  row  rank  and  Z  ^  \VZ  is  positive  definite.  This  method 
of  computing  p  *  is  discussed  by  Murray  and  Wright  (1982).  The  solution  of  the  QP  subproblem 
can  also  be  viewed  as  the  result  of  applying  one  step  of  Newton's  method  to  the  nonlinear  system 
of  equations 


<?(^.X) 


g(x)-A(xf\ 
c(x) 


=  0. 


(see  Nocedal  and  Overton  (1985)). 

Returning  to  problem  (4),  we  note  that  the  solution  is  not  unique,  since  any  of  the  vectors 
u  ,  V  and  w  can  be  replaced  by  its  negative.  We  will  denote  the  «  th  constraint  by  r,  (u  ,v  .w  ). 
For  this  problem  we  have 


9   = 


2  An 

2Cw 


and 


A   = 


2u  0  0  V  w  Q 
0  2i'  0  u  Q  w 
0      0     2w   0    u     t; 


When  the  constraints  are  satisfied,  Z  can  be  given  by 

V     Q     w 
Z  ^     -u    w     0 

0     -V   ~u 

Sequential  quadratic  programming  methods  typically  approximate  the  Hessian  matrix  W  or  the 
projected  Hessian  Z^  WZ  by  using  quasi-Newton  updating  (for  example,  Gil!  et.  al.  (l985a,b), 
Powell(1978),  Han(1976),  Gurwitz(l986)).  However,  since  the  second  derivative  matnces  {G,  }  are 
readily  available  for  problem  (4),  the  algorithm  can  form  Z     WZ  directly. 

Since  A     \  ^  g   from  (6)  ,  we  have 


u'^  Au 

v'^  Bv 

w     Cw 
{v'^  An    +  ti^5t') 
(w^  Au    +  u'''  Cw) 
(w^  Bv   +  v'^  Cw] 


and  the  projected  Hessian  of  the  Lagrangian  function  has  the  form: 


Z  ^  WZ  = 

2v'^  Av-2u  "^  An  +2u  ^5u-2i'^j9i' 
u'^,4«  +u  ^  Cw-2u  '^  Bw 
-w  '^  Bv-v'^  Cw  +2w  ^  Av 


w'''  Au+u'^  Cw  -2u  ^  Bw  -wtBv  -v  ^  Cw  ^2v  ^  Aw 

2u'^Bu'-2r^Bf  +2v'''  Cv-2w^  Cw  -v'''  Au  -u  '^  Bv  ^2v'^  Cu 

-v'^  Au-u'^  Bv+2u  '^  Cv  2w'^  Aw-2u'^  Au  +2u  ^  Cu-2w'^  Cw 


A  special  feature  of  problem  (4)  is  that  it  is  relatively  easy  to  ensure  that  the  constraints 
remain  satisfied.  If  the  new  iterate  does  not  satisfy  the  constraints,  the  vectors  u  ,v  ,  and  w  can 
be  orthonormalized  by  applying  the  Gram-Schmidt  process.  Thus  c  will  always  be  zero,  and  the 
Py  term  defined  by  (7a)  can  be  ignored.  The  Newton  step  is  thus  defined  simply  by 

Z  ^  WZp,   =-Z^  g 
P   =  Zp,  . 


We  present  an  SQP  method  for  solving  (4),  in  which  we  take  advantage  of  the  special  struc- 
ture of  the  constraints.  The  method  we  propose  generates  a  sequence  of  feasible  points  which 
converge  to  a  stationary  point  of  (4).  If  the  projected  Hessian  is  not  positive  definite  at  the  solu- 
tion, we  introduce  a  perturbation  in  order  to  move  away  from  a  saddle  point.  However,  since  the 
projected  Hessian  could  be  singular  at  the  solution,  this  is  done  only  a  few  times. 

START  with  u°,v°,w°. 
We  will  use  the  notation  [u  ,v  ,w)  to  denote 
Choose  i  to  be  some  small  positive  value. 


Set  count    =  0. 


Evaluate  Z*  ,  Z*  ""  y  *  ,  \*  ,  Z*'"vr*Z*. 


WTilLE  (    I    I  Z*    5*   I    I  >€  or  Z*     W*' Z*"  is  not  positive  definite  and  coun^  < 4 


DO 


Evaluate  Z*,  Z*%*,  X*,  Z'''^  W''  Z''  . 


Compute  the  modified  Cholesky  factorization  of  Z*     U'*  Z* 


(This  in  effect  produces  a  factorization  of  Z*     VV*  Z*  +£■*  ,  where  E^ 
is  a  diagonal  matrix  which  is  added  so  that  the  resulting  matrLx  is 
positive  definite  (see  Gill,  et.  al.  (1981).) 


IF  Z*  ^  U'*  Z*   is  not  positive  definite 


THEN  set  count   =  count  +1 


W    I    I  Z*    (7*   I    I  >£ 


THEN  set  the  r.h.s.  to  Z*    g^ 

ELSE  set  the  r.h.s.  to  a  random  positive  vector 
Solve  (Z*  '"  W"  Z*  +£•*  )p/  =  -r.h.s. 
Set  p*  =  Z*p/ 
Set  Q  =  1 

REPEAT 

Set  (u-^,v-^,w*)  =  (u*  ,1;*  ,u)*  )  +  ap* 

Use  Gram-Schmidt  to  orthonomalize  {u'^,v'^,w'^) 

Set  a  =  a/2 

UNTIL  (  /  {u^,v*,w+)</  («*,t;*,u;*)  ) 

Set  Q*   =  2q 

Set  (u*+',f*+', «;*  +  ')  =  («+,«+,«;+) 

Set  k   =k  +1 


Evaluate  Z* ,  Z*""./*,  X*,  Z*''h^*Z* 


ENDWHILE 

In  the  context  of  image  processing,  Problem  (4)  arises  naturally  in  three  dimensions.  Note, 
however,  that  the  algorithm  can  be  used  to  solve  the  general  problem  in  any  number  of  dimen- 
sions.   The  general  problem  b 
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min     f  (xi,X2,  ■  ■  ■  .Xn)  =  xj  AiXi  +  12'^  A2X'2+    ■■■    +x„'^A„x„ 

S.t.     Xi  ^  X{    =1 

Xi'^Xj      =0  iy^j 

and  the  matrices  A,    are  all  positive  definite.    The  problem  then  has  n^  unknowns  and  — ^ 

constraints.    The  matrix  Z ,  whose  columns  form  a  basis  for  the  null  space  of  the  constraint  gra- 

dients,  has  columns.  The  algorithm  therefore  solves  — - —  X  — - —  systems  of  equations 

at  each  iteration.  A  nice  feature  of  the  two-dimensional  case  is  that  the  Newton  step  satisfies  the 
orthogonality  constraint,  so  that  the  new  iterate  can  be  found  simply  by  normalizing  the  Newion 
step.  In  the  following,  we  will  restrict  our  attention  to  the  original  problem  as  posed  in  R '. 
Clearly,  a  similar  analysis  can  be  used  to  prove  convergence  in  R"  . 


3.  Global  Convergence 

The  algorithm  generates  a  sequence  of  feasible  points.    Thus  to  prove  convergence  to  a  sia- 
tionarj'  point  of  problem  (4),  it  is  sufificient  to  show  that    lim    |    |  Z*      (;*    |    |    — '  0.    We  show 

that  the  objective  function  decreases  initially  from  the  current  point,  (u  *  .f*  ,u'' ).  along  the 
direction  p  .  We  travel  along  this  direction  to  a  new  point  (u*,v^,w'^)  and  apply  Gram- 
Schmidt  to  {n^,v^,w*)  to  generate  (u  *"^',i'*  "^\w;*  "^')  which  is  a  feasible  point.  The  step  along 
the  direction  p  *  can  be  chosen  so  that  /  («  *  "^',d*  "^^w*  "^'j  <  /  (u*  ,d*  ,u'*  ).  We  show  that  the 
step  to  the  constraints  does  not  impede  descent  and  that  the  step  length,  q*  ,  does  not  become 
arbitrarily  small.  Finally,  we  show  that  in  the  limit,  the  projected  gradient  tends  to  zero  and 
thus  the  algorithm  is  globally  convergent. 

Any  direction  along  which  the  objective  funtion  decreases  initially  is  called  a  descent  direc- 
tion.   If  we  consider  the  local  Taylor  series  approximation  to  /  ,  it  is  clear  that  if  g^  ^  j>^  <0- 

then  p      is  a  descent  direction. 

Lemma  1:  p  *   is  a  descent  direction  for  /  . 
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Proof:  By  definition,  (Z*  ^  H^*  Z* +£■*  )p/  =  -2'*^j*.    Thus 

Since  Z*     H'*  Z*  +£■*  is  positive  definite,  j*    p*  must  be  negative  if  p/7^0,  i.e.,  p*  7^0. 
Let  p/  =  (a  ,6  ,c  )^  and  /   =  max(  |  a    |  ,  |  6    |  ,  |  c    |  ). 
Lemma  2:  At  (u'*' ,v'*' ,w'*')  the  constraint  violations  are  O  (a*  p*  ). 
Proof: 

«*u^  =  l4-a*a*  +  Q*c- 

r^  =  -Q*  a   +  a^  a    +  q*'6c 
u;^=-Q*6    +  Q*  6    4-  Q*'ac 
«       w  ^  ^  Q    c    -ate    +  Q    c 

Lemma  3:  The  step  from  {u'^ ,v'*' ,w'*')   to   the  constraints,  i.e.,  the   change  given  by  the 

,0  ,2 
Gram-Schmidt  process,  is  O  {a    p    ). 

Proof:  The  new  iterates  are  given  by 


«*-'  = 


« "*"    V 


r  "*■    w 


t;*+'  = 


*  +1.,  +^,.  *  ^1 


(«*^'i.")u 


„+_(u*+i\+)„' 


t;+- 

11"*- 

(«+^r+)«^ 

«-^«-        '    ' 

«■*-' 


I    \w^  -(m'^'^  w^)u'^'  -{v'^'^  w^jv'^' 
The  size  of  the  step  taken  to  the  constraints  is  thus 


10- 


«*+'-«  + 


=  (""-  I  l%l  I )'("'-  I  l\l  I ) 


«^^«+-2«+^ ^ 


+  1 


=  V{l-\    \u^\    \f  =  O  (a*  V)   by  Lemma  2. 


I    |t;*+'-v-'|    I    =  v^t,+^t;+-2i;+^v*+'  +  1 


T  T 


{u+'  V- 


.-\- 


2(u 


.T     ^2 


r       2      r 


+  1 


,+r     ,       2(«+%+)2 


V         V       - 


+    1 


=      v^    v^  -2cv*    v^  -27- 


-t-  1 


where 


T  2 

2u^    v^ 


(«^^•+l 


Thus,  F  =  0  (1  ^  4-q'' V)  .    It  follows  th 


at 


v'^'  -v^ 


=  v/(l  -    I    I  r  '  I    I  )-  -  O  (a*  V')  =  O  (a*'p*')  a^  above. 


Similarly,  it  can  be  shown  that 


,*+i 


«.  ••■•-„,•*•  I    1    =  \/u'  +  ^u'+  -2«''"^w*+'  -  1 


=  V^(l  -    I    I  "'^  I    I  )'  +  0(a*V')  =  0(q*VT 
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Given  that  the  step  to  the  constraints  is  O  (q*  p*  ),  it  follows  that 
f  {u'-'\v'*\w'^')=f  {(u\v\w^)+  0(q*V')) 

=  /  («*,r*,u;*)  +  aS*^P*+0(a*'p*'). 
The  method  of  computing  the  factorization  guarantees  that  Z*     W^  Z*  +£"*    is  bounded 
away    from    singularity.      Let    r;*     and    <;*     be    the    minimum    and    maximum    eigenvalues    of 
Z*  ''  H'*  Z*    +  £■*  ,  respectively.    It  follows  that 

n'  I  U*%*  I  I  <  I  |p/l  I  <  ^^  I  U'%*  M.  ■" 

Thus,  from  the  proof  of  Lemma  1  it  follows  that  o*  j  *    p*    is  a  negative  term  of  order  a*p*'. 
Therefore,  there  must  exist  an  q*  such  that  /  (i/*"''',t' *"*"',«)  *'^')</  (u*,t)*,u'*). 

Lemma  4:  q     is  bounded  away  from  zero. 

Proof:  Since  the  method  generates  only  feasible  points, 

I    I  «*   i    I  =  I    I  ''^    I    I  =  I    I  «'*   I    1=1- 
As       noted        above,         I    I  p/  I    I         is       bounded.         Moreover,       we       note        that       since 

ff*=(2.4u*,  2Bi'*,  2Cu'*)^,    lU*   I    I    <  2x/3A/,  where  A/  is  the  largest  condition  number  of 
the  matrices  A  ,B   and  C  .    Thus,  p     is  bounded.    Clearlv,  if  we  choose  a*    <  where  K  is  an 

appropriate  constant,  we  obtain  a  decrease  in  /  .    Since  p     cannot  become  arbitrarily  large,  a*   is 
bounded  away  from  zero.    Note  that  when  p*   is  sufficiently  small,  we  can  take  o*  =1. 

Theorem:  If  (u°,v°,w°)  satisfy  the  constraints,  then  either  the  sequence  (u  *  ,f  *  ,u;*  )  gen- 
erated by  the  method  terminates  at  a  stationary  point  of  problem  (4)  or 

lim   inf   |    |  Z*'"?*   |    |    =  0. 

k  —•00 

Proof:  Assume  that  the  conclusion  of  the  theorem  is  false.    Then  there  exists  an  f  >0  such 
that    I    I  Z*^£?*   I    I    >  f  for  alH-  .    We  have  that 

Since  /    is  bounded  below,  by  taking  the  sums  of  both  sides  of  the  equation,  we  obtain  that 
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*  =0  *  =0 

Since  the  assumption  is  that    |    |Z*    j*|    |    >€>0  for  all  k  ,  and  rj*  >0  is  guaranteed  by  the 

method  of  computing  the  modified  Choiesky  decomposition,  we  have 

lim   a*   =  0. 

i-oo 

However,  this  contradicts  Lemma  4.    Hence  the  assumption  was  false. 
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